1. This Appendix provides an overview of how electronic documents are created, managed and stored, and identifies some of the questions, challenges and opportunities that arise in the production of electronic documents as a result of their unique characteristics.

2. To address electronic document production issues that may arise, it is useful to have a basic understanding of how a typical office computer network works and where electronic documents responsive to a document request may be located and retrieved. Electronic documents may be located either in relatively accessible 'active' sources or in less accessible 'inactive' back-up, fragmented or deleted sources.

3. 'Active' electronic documents, which ordinarily should be the sole source of production of any electronic documents in international arbitration, are generally stored in a readily usable format and are relatively easy to access. 'Inactive' electronic documents are generally harder and more expensive to access and produce. The diagram below illustrates a simplified office computer network and the hardware components that may be used to create, manage and store electronic documents, and the sections that follow describe the 'active' and 'inactive' sources of electronic documents, as well as the data about electronic documents known as 'metadata'.

Diagram of a Typical Office Computer Network

A. 'Active' electronic documents

4. Personal computers: At a basic level, when a human being (a 'user' or document 'custodian') sits down at his or her 'workstation' or desk and writes an email, drafts a word-processing document or populates an electronic spreadsheet (i.e. creates, stores or manipulates electronic documents), he or she does so on a 'personal computer' (or 'PC'), either a 'desktop' or 'laptop' computer. The PC will typically have a 'local' hard drive where electronic documents created on that PC may be located and stored, and accessible only through that specific computer. The kinds of electronic documents that may exist on the local hard drive of a personal computer include virtually any variety of document or file a person can create using today's vast array of computer software 'applications'. In a business context, these will most commonly include word-processing files, emails, spreadsheets and slide presentations: the familiar array of files that most office PCs are equipped to create.

5. Electronic documents created on any given user's PC may also be located wherever else those electronic documents may have been sent by the user, or otherwise automatically stored. For example, when a user sends an email, it will typically be recorded in the sender's 'sent' box; it will also appear in the 'in' box of one or many recipients. The email's recipients in turn may have forwarded the same email to other recipients. On a network, a copy of the email may reside on the PC hard drives of the sender and/or one or more recipients. As described below, it may also be recorded on a shared server, as well as on a 'personal digital assistant' ('PDA') device such as a BlackBerry, which replicates each user's email remotely. Where documents are held on a server, the user's PC becomes an input and viewing device for electronic documents located elsewhere.

6. Shared servers: To maximize computer efficiency and to promote office interconnectedness, each user/custodian's PC located at his or her desk will typically be part of a 'network' of many such workstation PCs. A network of individual PCs will usually be constructed around a number of shared 'server' computers, which constitute a second potential source of 'active' electronic documents. A 'server' is a computer separate from the PC located on each user's desk, located and operated centrally, that contains software to perform specific functions for all of the PCs (or 'clients') in the network that are interconnected by the server. One example would be a shared email server. Office computer networks will typically have one or more servers that do nothing but 'serve' the network's email needs for all of the users in the network. Thus, instead of each user having their own emails created, sent, received and stored on their individual PC at their desk, all of the emails created, sent, received and stored by all users in the network will reside on one or more shared server computers that do nothing but process emails. The result is that a particular user's email, which they access from their PC, may not actually be located on their individual PC, but instead may be located entirely or in large part on a shared server computer, physically located somewhere else, to which their PC is connected, and which centrally provides email for all network users whose PCs are connected to that email server.

7. The shared functions within an office computer network that may be performed by servers rather than by each individual user's PC can be just about anything. Servers may be dedicated to providing network user access to the Internet, providing internal kinds of messaging applications separate from email, managing the printing of all documents in the office, managing and storing electronic documents created by BlackBerry or other handheld devices, or maintaining other kinds of network electronic documents, like all word-processing files or all accounting or staff personnel files in a centrally accessible location. Having dedicated 'servers' perform different aspects of a company's business on behalf of all PCs in a network enables centralization and administrative control over a company's electronic documents, whereby a company's 'Information Technology' ('IT') officer or department can monitor use, or assign or withhold different user access rights, for instance by limiting the number of users/custodians who may access the server containing the company's staff personnel files or other sensitive need-to-know data.

8. When shared servers are used to create an office network, electronic documents generated by an individual network user at his or her PC may actually be stored in a shared server computer located somewhere else, which that individual user's PC accesses for the particular kind of electronic documents involved (email, word-processing, accounting data, and so on), rather than located on the hard drive of the individual user's desktop or laptop PC. Consequently, a file created by one user in the network may be equally accessible to all or several other users in the network, who can equally access, copy, modify, delete, overwrite or send a particular document or file that was created by another user on the shared server.

9. 'Legacy computers': Sometimes a company's network of computers may include 'legacy computers', i.e. outdated computer hardware containing antiquated software or data that is still necessary to perform certain aspects of the company's business. For example, a company may have an antiquated or custom-designed accounting system, which resides on a specific computer that can still support the accounting application the company has been using for many years. The speed at which new hardware and software are introduced on the market to replace, supplement or update a company's existing computer systems may often result in piecemeal changes and updates to an existing computer network, requiring phased integration of old and new hardware and software over an extended period of time. Electronic documents created and stored on legacy computers may only be located on one or more specific legacy computers in the network and may not be accessible or readable on any of the company's other computers since other forms of hardware do not support the data concerned. Such legacy electronic documents may be difficult to retrieve and produce in an accessible format if the outdated hardware or software used to create those documents is required to read and use them. Many of the major software providers enable new software releases to be 'backward-compatible' (i.e. able to access electronic documents produced on earlier versions of the software) to a point. However, this is not always the case, for example if a company is operating a bespoke system.

10. PDAs: In addition to serving individual PC and laptop workstations connected within the office network, specific servers may also serve other kinds of client computers, such as personal digital assistants (an ever-expanding array of portable handheld devices, including BlackBerries, smart phones and Palm Pilots), as well as remote PCs or laptops maintained by company employees at home or otherwise outside the office. All of these remote or wireless devices are capable of creating, managing and storing electronic documents in their own right, and remotely accessing, altering, deleting and exchanging electronic documents located on the office network servers via an Internet connection. PDAs will typically contain hard drives of their own where electronic documents may also be stored, in addition to accessing remotely electronic documents that are located on the office network. However, in many cases, any document received, created or sent by a PDA will be automatically synchronized back to a central server.

11. Third-party PCs, servers, and data rooms: Electronic documents may also be found on third-party servers or PCs, completely external to a company's network. For example, data and files may be sent to and then stored on a third-party's PC or network through a 'file transfer protocol' ('FTP') client, which allows easy transfer of large amounts of data and files through the Internet. Electronic documents might also be found on an external server in a virtual 'data room'. Data rooms usually are password-protected, but are often accessible to multiple parties through the Internet. Data rooms may be maintained by the company or a third-party vendor that provides off-site data storage and management, and can contain data and files of both the company and third parties.

12. Removable media: Finally, 'active' electronic documents can also be stored on removable media, such as CDs, DVDs, disks, tapes, and USB portable drives. These compact storage devices for electronic information can be used on any computer. They effectively provide a removable and portable hard drive, capable of storing any of the same kinds of electronic documents that a PC can generate, and may be located virtually anywhere: in the office, at home, in a car, in a briefcase, a pocket, with a third party, etc.

B. 'Inactive' electronic documents

13. In addition to the foregoing components of a typical 'active' computer network, business computer networks typically will also include 'archived' or 'inactive' electronic documents. Such inactive electronic documents can be located on the same clients and servers that are part of a company's active network, or on dedicated back-up servers or removable disks or tapes, which are maintained separately from the active network so as to protect and preserve electronic documents that can be vital to a business's survival in the event that a catastrophe compromises the company's active computer system and the electronic documents it contains. Depending on a company's business practices, archived electronic documents may be stored within an organized structure. In contrast, back-up servers or tapes will typically maintain a 'snapshot' of all or specific portions of a company's active electronic documents, taken on a periodic basis to preserve the company's data in the event of catastrophic system failure, loss or damage such as may be caused by a fire, earthquake, virus contamination, or other core threats to a company's business. Back-up servers or tapes should not, therefore, be expected to provide a comprehensive set of all of a business's electronic documents. Furthermore, back-up electronic documents are typically not maintained in a format that is readily accessible or searchable, as they are intended only for disaster-recovery purposes. Typically, back-up tapes are not well structured. Therefore, it is usually necessary to restore the entire tape or collection of tapes (at considerable expense) in order to investigate only a small part that may be relevant to a particular dispute.

14. 'Deleted' electronic documents: 'Deleted' electronic documents are another form of inactive electronic documents. 'Deleted' is a misnomer insofar as 'deletion' of a document or file on a computer may serve only to move it from one location (e.g. an email inbox) to another (e.g. an email 'trash' or 'recycle' folder) where it remains and can readily be retrieved for some period of time. When an electronic document is then deleted from a trash or recycle folder, that typically means only that the digital storage space required to maintain that particular electronic document has been designated as available for the storage of different information as and when the computer automatically determines that the same space is needed to store new or different information. But the deleted item continues to reside on the computer until it is overwritten with new and different information. This can result in 'fragmented' files, as computers will move and divide data designated as deleted in order to efficiently make room for new data. However, computer forensic techniques exist that can retrieve a deleted electronic document even long after it has been designated as deleted.
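The behaviour described in paragraph 14 can be illustrated with a deliberately simplified sketch. The `ToyDisk` class below is hypothetical and does not model any real file system; it only shows the general principle that deletion marks storage space as available, while the content remains until that space is reused.

```python
class ToyDisk:
    """Toy model of a storage device (an illustrative assumption, not a real file system)."""

    def __init__(self, n_blocks):
        self.blocks = [None] * n_blocks      # raw contents of each storage block
        self.table = {}                      # file name -> list of block indexes
        self.free = set(range(n_blocks))     # blocks designated as available

    def write(self, name, chunks):
        used = []
        for chunk in chunks:
            idx = min(self.free)             # reuse the lowest available block
            self.free.remove(idx)
            self.blocks[idx] = chunk         # old content is overwritten only here
            used.append(idx)
        self.table[name] = used

    def delete(self, name):
        # "Deletion" removes the table entry and marks the blocks as available;
        # the block contents themselves are left untouched.
        self.free.update(self.table.pop(name))

    def residual(self):
        # Content still present in blocks that are marked as available
        return [self.blocks[i] for i in sorted(self.free) if self.blocks[i] is not None]


disk = ToyDisk(4)
disk.write("memo.txt", ["draft ", "terms"])
disk.delete("memo.txt")
after_delete = disk.residual()       # both chunks of the deleted file remain
disk.write("new.txt", ["hello"])
after_overwrite = disk.residual()    # one chunk overwritten, one fragment remains
```

In this sketch the deleted file is fully recoverable until new data is written, after which only a fragment survives, which corresponds to the 'fragmented' files mentioned above.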

15. It is also worth noting that even when an electronic document is deleted from one location on one computer, PDA or storage device, an identical copy may continue to exist somewhere else on a company's computer system. For example, if the sender of an email deletes the email from his or her sent box at work, the email may continue to exist in a multitude of other email folders of other network users.

C. Metadata

16. 'Metadata' is, literally, data about (electronically stored) data. Documents or files created on a computer will typically contain embedded information that is not readily apparent on the screen view of a file or in a printed version of the document or file. This secondary metadata is information about the electronic document or file that describes its characteristics, origins, or usage. There are three basic categories of metadata:

(i) 'Substantive' (or 'application') metadata is created by the software used to create the document, and reflects (among other things) editing changes or comments made to the document over time. Substantive metadata is embedded in the document it describes and therefore remains with the document when it is copied, moved or produced. It may be useful in showing the genesis of a document and the history of proposed and/or accepted revisions to the document.

(ii) 'Systems' metadata reflects automatically generated information about the creation or revision of a document, such as the document's author or the date and time of its creation, modification or delivery. Systems metadata is not necessarily embedded in the document but can be generated by the computer system on which the document was created, and can be relevant if a document's authenticity is at issue or there are issues as to who received a document (including blind copy recipients that do not appear on the face of a document) or when it was received.

(iii) 'Embedded' metadata is inputted into a document by its creator or users but cannot be seen in the document's display, and commonly includes the formulas used to create spreadsheets, hidden columns, references, fields or linked files. Embedded metadata can be critical to understanding complex spreadsheets (such as those often used, for example, in construction projects) which on their face do not explain the mathematical formulas underlying or relating to the various rows or columns of information that are displayed on a computer screen or a printed version of the spreadsheet.
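As an illustration of the 'systems' metadata described in sub-paragraph (ii), the following minimal sketch uses Python's standard `os.stat` call to read information that the computer system records about a file but that does not appear in the file's content. The file name and text are hypothetical, and the attributes actually available vary between operating systems.

```python
import os
import tempfile
from datetime import datetime, timezone


def system_metadata(path):
    """Return file-system metadata that is not part of the file's content."""
    st = os.stat(path)
    return {
        "size_bytes": st.st_size,
        "modified_utc": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
    }


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "memo.txt")          # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("draft terms")                    # the document's visible content
    with open(path, encoding="utf-8") as f:
        content = f.read()
    meta = system_metadata(path)                  # recorded by the system, not in the text
```

The modification time held in `meta` is not contained anywhere in `content`, which is why such metadata may be lost or changed when a file is copied to another system or produced only as a print-out.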

17. 'Visible' metadata should be distinguished from 'hidden' metadata. Visible metadata is commonly displayed on screen and/or in print-outs and hidden metadata is not. In the case of an email, strictly speaking, all its constituent fields are metadata. Examples of visible metadata include the to/from/cc/date/title fields. Examples of hidden metadata would include the route the email took over the Internet and the IP address from which it was sent. Most of the metadata mentioned in sub-paragraphs (i) to (iii) above is hidden metadata.
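The distinction drawn in paragraph 17 can be sketched with Python's standard `email` module. The message below is invented for illustration (the addresses use reserved example domains), and the set of fields treated as 'visible' is an assumption, since what is displayed depends on the email program used. The 'title' field is taken here to correspond to the `Subject` header.

```python
from email import message_from_string

# Hypothetical raw email, as it might be stored on a server
raw = """\
Received: from mail.example.org ([192.0.2.10]) by mx.example.com; Mon, 1 Jan 2024 10:00:05 +0000
From: alice@example.org
To: bob@example.com
Cc: carol@example.com
Date: Mon, 1 Jan 2024 10:00:00 +0000
Subject: Draft terms
Message-ID: <abc123@example.org>

Please see the attached draft.
"""

# Assumption: the fields an email program typically shows to the reader
VISIBLE = {"from", "to", "cc", "date", "subject"}

msg = message_from_string(raw)
visible = {k: v for k, v in msg.items() if k.lower() in VISIBLE}
hidden = {k: v for k, v in msg.items() if k.lower() not in VISIBLE}
```

Here `hidden` holds the `Received` header, which records a step in the route the email took and the sending IP address, together with the `Message-ID`; neither would normally appear on a print-out of the email.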

18. Metadata is most commonly produced either (a) in PDF or tagged image file format (TIFF) with an accompanying 'load file' which permits the recipient to search the document for the relevant metadata, or (b) in the 'native' format in which the document being produced was created and which provides the recipient with all of the information available to the original user.
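To illustrate option (a), the sketch below treats a load file as a simple delimited table in which each row links a produced image to selected metadata fields. The column names, document identifiers and values are hypothetical; load file formats used in practice vary and are normally agreed between the parties or dictated by the review software.

```python
import csv
import io

# Hypothetical load file: one row per produced document, linking the image
# file to metadata fields extracted from the original electronic document.
load_file = io.StringIO(
    "doc_id,image_path,author,date_created\n"
    "DOC-0001,images/DOC-0001.tif,A. Smith,2008-03-14\n"
    "DOC-0002,images/DOC-0002.tif,B. Jones,2008-04-02\n"
)

rows = list(csv.DictReader(load_file))

# Search the production by a metadata field rather than by document text
by_author = [r["doc_id"] for r in rows if r["author"] == "B. Jones"]
```

The images alone would not be searchable by author or date; it is the accompanying table that makes the metadata available to the recipient.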